How to gain speedups of 1000 on single processors with fast FEM solvers: Benchmarking numerical and computational efficiency
Abstract
In Computational Science and in particular in the numerical simulation of PDE problems, optimal serial performance is essential for a successful scale-out to the tera- and petascale dimensions. In this paper, we propose a simple yet fundamental benchmark setting for a PDE problem that we believe any reasonably flexible Finite Element based software should be able to handle effortlessly. The Poisson problem used in these tests allows reliable performance estimates for more challenging simulations. Our performance evaluation focuses on numerical methodology and data layouts rather than implementational fine-tuning. To enable a fair and realistic comparison independent of the underlying numerical methodology, we define the metric total efficiency. Results are presented for two different solver classes, multigrid and Krylov-subspace methods, obtained in single-core computations with our solver packages Feat2 and Feast. We quantitatively emphasise the effect of different storage techniques and numbering (reordering) schemes, which constitute the crucial factor in view of the memory wall problem that ultimately determines performance of all Finite Element codes. We demonstrate a speed-up of more than a factor 1000 by migrating from a naive implementation of a standard Krylov solver to a sophisticated implementation of an advanced multigrid solver, without applying any adaptivity.
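The numerical side of the Krylov-versus-multigrid gap described in the abstract can be illustrated with a small toy experiment. The following is only a minimal sketch under stated assumptions, not the paper's benchmark: it uses a 1D finite-difference Poisson problem rather than a Finite Element discretisation, pure Python rather than Feat2 or Feast, and it counts iterations instead of evaluating the paper's total efficiency metric. All function names (`apply_A`, `cg_iterations`, `v_cycle`, `mg_iterations`, `compare`) are hypothetical and introduced here for illustration.

```python
import math

def apply_A(u, h):
    """Matrix-free 1D Poisson operator (-u'') with zero Dirichlet boundaries."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = (2.0 * u[i] - left - right) / (h * h)
    return out

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cg_iterations(f, h, tol=1e-8, max_iter=10000):
    """Unpreconditioned conjugate gradients; returns the iteration count."""
    u = [0.0] * len(f)
    r = list(f)          # residual for the zero initial guess
    p = list(r)
    rr = sum(x * x for x in r)
    r0 = math.sqrt(rr)
    for k in range(1, max_iter + 1):
        Ap = apply_A(p, h)
        alpha = rr / sum(a * b for a, b in zip(p, Ap))
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = sum(x * x for x in r)
        if math.sqrt(rr_new) <= tol * r0:
            return k
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return max_iter

def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi smoothing (the diagonal of A is 2/h^2)."""
    for _ in range(sweeps):
        Au = apply_A(u, h)
        u = [ui + omega * (h * h / 2.0) * (fi - ai)
             for ui, fi, ai in zip(u, f, Au)]
    return u

def v_cycle(u, f, h):
    """One V(2,2) cycle; assumes len(u) == 2**L - 1 for some L >= 1."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]   # exact solve on the coarsest grid
    u = smooth(u, f, h)
    r = [fi - ai for fi, ai in zip(f, apply_A(u, h))]
    nc = (n - 1) // 2
    # Full-weighting restriction: coarse point j sits at fine index 2j+1.
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
          for j in range(nc)]
    ec = v_cycle([0.0] * nc, rc, 2.0 * h)
    # Linear interpolation of the coarse-grid correction.
    e = [0.0] * n
    for j in range(nc):
        e[2 * j + 1] += ec[j]
        e[2 * j] += 0.5 * ec[j]
        e[2 * j + 2] += 0.5 * ec[j]
    u = [ui + ei for ui, ei in zip(u, e)]
    return smooth(u, f, h)

def mg_iterations(f, h, tol=1e-8, max_iter=100):
    """Repeated V-cycles; returns the number of cycles needed."""
    u = [0.0] * len(f)
    r0 = norm(f)
    for k in range(1, max_iter + 1):
        u = v_cycle(u, f, h)
        r = [fi - ai for fi, ai in zip(f, apply_A(u, h))]
        if norm(r) <= tol * r0:
            return k
    return max_iter

def compare(levels):
    """Solve -u'' = 1 on (0,1) with n = 2**levels - 1 unknowns."""
    n = 2 ** levels - 1
    h = 1.0 / (n + 1)
    f = [1.0] * n
    return n, cg_iterations(f, h), mg_iterations(f, h)

if __name__ == "__main__":
    for levels in (6, 7, 8):
        print(compare(levels))   # (unknowns, CG iterations, V-cycles)
```

In this toy setting the CG iteration count grows with the number of unknowns, while the number of V-cycles stays roughly constant. Iteration counts capture only the numerical part of the picture; the storage and numbering effects that the abstract highlights concern the cost per iteration and are not modelled here.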
Related resources
High Performance FEM Simulation in CFD and CSM
Processor technology is still dramatically advancing and promises further enormous improvements in processing data for the next decade. In contrast, much lower advances in moving data are expected such that the efficiency of many numerical software tools for Partial Differential Equations (PDEs) is restricted by the cost for memory access. In last year’s Research Report [7] we outlined the nume...
Accelerating high-order WENO schemes using two heterogeneous GPUs
A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code, written in the CUDA programming language, is developed by modifying a single-GPU code. The OpenMP library is used for parall...
Viscous Models Comparison in Water Impact of Twin 2D Falling Wedges Simulation by Different Numerical Solvers
In this paper, the symmetric water entry of twin wedges is investigated for a deadrise angle of 30 degrees. Three numerical simulations of a symmetric impact, considering the rigid-body dynamic equations of motion in two-phase flow, are presented. The two-phase flow around the wedges is solved by the Finite Element based on Finite Volume method (FEM-FVM), which is used in conjunction with the Volume of Fluid (VOF) sc...
Fast Finite Element Method Using Multi-Step Mesh Process
This paper introduces a new method for accelerating currently sluggish FEM and reducing memory demand in FEM problems with high node resolution or bulky structures. Like most numerical methods, FEM results in a matrix equation that normally has a very large dimension. By breaking the main matrix equation into several smaller matrices, the solution procedure can be accelerated. For implementing ...
Architecting the finite element method pipeline for the GPU
The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core stream...